Using Punctuations and Lengths for Bilingual Sub-sentential Alignment
نویسندگان
چکیده
We present a new approach to aligning bilingual English and Chinese text at sub-sentential level by interleaving alphabetic texts and punctuations matches. With sub-sentential alignment, we expect to improve the effectiveness of alignment at word, chunk and phrase levels and provide finer grained and more reusable translation memory.
منابع مشابه
Interleaving Text and Punctuations for Bilingual Sub-sentential Alignment
We present a new approach to aligning bilingual English and Chinese text at sub-sentential level by interleaving alphabetic texts and punctuations matches. With sub-sentential alignment, we expect to improve the effectiveness of alignment at word, chunk and phrase levels and provide finer grained and more reusable translation memory.
متن کاملLinguistically-Based Sub-Sentential Alignment for Terminology Extraction from a Bilingual Automotive Corpus
We present a sub-sentential alignment system that links linguistically motivated phrases in parallel texts based on lexical correspondences and syntactic similarity. We compare the performance of our subsentential alignment system with different symmetrization heuristics that combine the GIZA++ alignments of both translation directions. We demonstrate that the aligned linguistically motivated p...
متن کاملAn Efficient Framework to Extract Parallel Units from Comparable Data
Since the quality of statistical machine translation (SMT) is heavily dependent upon the size and quality of training data, many approaches have been proposed for automatically mining bilingual text from comparable corpora. However, the existing solutions are restricted to extract either bilingual sentences or sub-sentential fragments. Instead, we present an efficient framework to extract both ...
متن کاملUsing Similarity Scoring to Improve the Bilingual Dictionary for Sub-sentential Alignment
We describe an approach to improve the bilingual cooccurrence dictionary that is used for word alignment, and evaluate the improved dictionary using a version of the Competitive Linking algorithm. We demonstrate a problem faced by the Competitive Linking algorithm and present an approach to ameliorate it. In particular, we rebuild the bilingual dictionary by clustering similar words in a langua...
متن کاملSampling-based Alignment and Hierarchical Sub-sentential Alignment in Chinese-Japanese Translation of Patents
This paper describes Chinese–Japanese translation systems based on different alignment methods using the JPO corpus and our submission (ID: WASUIPS) to the subtask of the 2015 Workshop on Asian Translation. One of the alignment methods used is bilingual hierarchical sub-sentential alignment combined with sampling-based multilingual alignment. We also accelerated this method and in this paper, w...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2003